29 research outputs found

    Solving Continuous Control via Q-learning

    Full text link
    While there has been substantial success for solving continuous control with actor-critic methods, simpler critic-only methods such as Q-learning find limited application in the associated high-dimensional action spaces. However, most actor-critic methods come at the cost of added complexity: heuristics for stabilisation, compute requirements and wider hyperparameter search spaces. We show that a simple modification of deep Q-learning largely alleviates these issues. By combining bang-bang action discretization with value decomposition, framing single-agent control as cooperative multi-agent reinforcement learning (MARL), this simple critic-only approach matches performance of state-of-the-art continuous actor-critic methods when learning from features or pixels. We extend classical bandit examples from cooperative MARL to provide intuition for how decoupled critics leverage state information to coordinate joint optimization, and demonstrate surprisingly strong performance across a variety of continuous control tasks

    A Parallel Autonomy Research Platform

    Get PDF
    We present the development of a full-scale “parallel autonomy” research platform including software and hardware. In the parallel autonomy paradigm, the control of the vehicle is shared; the human is still in control of the vehicle, but the autonomy system is always running in the background to prevent accidents. Our holistic approach includes: (1) a driveby-wire conversion method only based on reverse engineering, (2) mounting of relatively inexpensive sensors onto the vehicle, (3) implementation of a localization and mapping system, (4) obstacle detection and (5) a shared controller as well as (6) integration with an advanced autonomy simulation system (Drake) for rapid development and testing. The system can operate in three modes: (a) manual driving, (b) full autonomy, where the system is in complete control of the vehicle and (c) parallel autonomy, where the shared controller is implemented. We present results from extensive testing of a full-scale vehicle on closed tracks that demonstrate these capabilities

    Stochastic Dynamic Games in Belief Space

    No full text

    Semi-Cooperative Control for Autonomous Emergency Vehicles

    No full text
    corecore